Weights Space Exploration Using Genetic Algorithms for Meta-classifier in Text Document Classification
Abstract
Automatic document classification has become an important task because of the continually increasing number of text documents that users have to deal with. The aim of this paper is to develop a non-adaptive meta-classifier for text documents with increased classification accuracy. The meta-classifier combines several SVM classifiers and a Naïve Bayes classifier. We propose a new meta-classification method that takes into account the corresponding positions and confidence degrees obtained for all the classes. Using Genetic Algorithms, we search for the optimal weighting factors for the values returned by each classifier separately. Consequently, the meta-classifier can select as the winning class a class that none of the component classifiers ranks first. The experimental results show that the proposed method can improve classification accuracy.
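The idea of weighting each classifier's per-class values and searching the weight space with a genetic algorithm can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (`combine`, `predict`, `accuracy`, `evolve_weights`), the plain weighted-sum combination rule (the paper's method also uses class positions, which this sketch omits), the use of one weight per classifier, and the GA operators (elitism, uniform crossover, Gaussian mutation).

```python
import random

def combine(weights, scores):
    """Weighted sum of confidences; scores[k][c] is classifier k's value for class c."""
    n_classes = len(scores[0])
    return [sum(w * s[c] for w, s in zip(weights, scores)) for c in range(n_classes)]

def predict(weights, scores):
    """Index of the class with the highest combined value (first index on ties)."""
    totals = combine(weights, scores)
    return max(range(len(totals)), key=totals.__getitem__)

def accuracy(weights, dataset):
    """Fraction of (scores, label) pairs that the weighted combination gets right."""
    hits = sum(predict(weights, scores) == label for scores, label in dataset)
    return hits / len(dataset)

def evolve_weights(dataset, n_classifiers, pop_size=20, generations=30, seed=0):
    """Hypothetical GA: one weight in [0, 1] per classifier, fitness = accuracy."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_classifiers)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda w: accuracy(w, dataset), reverse=True)
        next_gen = ranked[:2]  # elitism: the two best individuals survive unchanged
        while len(next_gen) < pop_size:
            a, b = rng.sample(ranked[:pop_size // 2], 2)      # parents from top half
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))  # clipped mutation
                     for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=lambda w: accuracy(w, dataset))

# Toy data (invented): two classifiers, three classes, four labelled documents.
dataset = [
    ([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]], 0),
    ([[0.2, 0.6, 0.2], [0.5, 0.3, 0.2]], 1),
    ([[0.1, 0.3, 0.6], [0.6, 0.3, 0.1]], 2),
    ([[0.5, 0.45, 0.05], [0.05, 0.45, 0.5]], 1),
]
best = evolve_weights(dataset, n_classifiers=2)
print(best, accuracy(best, dataset))
```

In the last toy document, one classifier ranks class 0 first and the other ranks class 2 first, yet with equal weights the combined values favor class 1. This is the effect the abstract describes: the winning class need not be ranked first by any individual classifier.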
Related resources
A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier
With the rapid growth in the number of documents, Text Document Classification (TDC) methods have become crucial. This paper presents a hybrid model of Invasive Weed Optimization (IWO) and a Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS), in order to reduce the large size of the feature space in TDC. TDC includes different actions such as text processing, feature extraction, form...
Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Besides its own merits, text classification (TC) has become a cornerstone in many applications. The work presented here is part of, and a prerequisite for, a project we have undertaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...
Genetic Algorithm based Feature Selection in High Dimensional Text Dataset Classification
The bag-of-words language model, based on the vector space model, is commonly used to represent documents in a corpus. However, this representation needs a high-dimensional input feature space that contains irrelevant and redundant features in order to represent all corpus files. Reducing the input space to non-redundant features improves the generalization property of a classifier. In this study, we developed a new objecti...
A New Document Embedding Method for News Classification
Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. A large amount of news spreads on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...
Meta-learning Method for Automatic Selection of Algorithms for Text Classification
The paper presents a meta-learning approach for the textual document classification task and for automatic selection of the best available algorithm for creating classifiers. After a brief introductory description of the principles of document preprocessing and of the creation and evaluation of classifiers, the meta-learning approach is presented as a method for automatic selection of the most appropriate cla...
Publication date: 2017